Huffman source coder

This example shows how a Huffman coder allocates variable length 
codewords to the transmitted symbols depending on their probability of 
occurrence. 


Source coding 


Huffman coding is a variable length coding scheme which allocates the 
longer codewords to less frequently occurring symbols and shorter 
codewords to more frequently occurring symbols. This minimizes the 
average transmission rate, as the frequently occurring symbols are 
allocated the shorter codewords. 


Simple source coding 


Symbol   Probability 
A        0.10 
B        0.18 
C        0.40 
D        0.05 
E        0.06 
F        0.10 
G        0.07 
H        0.04 

8-symbol signal to be encoded 


We have to start with knowledge of the probabilities of occurrence of all the 
symbols in the alphabet. The table above shows an example of an 8-symbol 
alphabet, A...H, with the associated probabilities for each of the eight 
individual symbols. 


The entropy of this source is: 

\[
\begin{aligned}
H = {} & -0.10\log_2(0.10) - 0.18\log_2(0.18) - 0.40\log_2(0.40) \\
       & -0.05\log_2(0.05) - 0.06\log_2(0.06) - 0.10\log_2(0.10) \\
       & -0.07\log_2(0.07) - 0.04\log_2(0.04) = 2.5524 \text{ bits/symbol}
\end{aligned}
\]


Source encoder entropy calculation 


[link] shows that the entropy of this source data is 2.5524 bits/symbol. 
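
The entropy figure can be checked numerically. The following is a minimal sketch, assuming only the symbol probabilities from the table above; the names PROBS and entropy are illustrative and do not come from the original notes.

```python
from math import log2

# Symbol probabilities taken from the 8-symbol table above.
PROBS = {"A": 0.10, "B": 0.18, "C": 0.40, "D": 0.05,
         "E": 0.06, "F": 0.10, "G": 0.07, "H": 0.04}

def entropy(probs):
    """Source entropy in bits/symbol: H = -sum(p * log2(p))."""
    return -sum(p * log2(p) for p in probs.values() if p > 0)

source_entropy = entropy(PROBS)
print(f"H = {source_entropy:.4f} bits/symbol")
```

Zero-probability symbols are skipped because they contribute nothing to the sum.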


Symbol   Code 
A        000 
B        001 
C        010 
D        011 
E        100 
F        101 
G        110 
H        111 

Simple fixed length (3-bit) encoder 


This shows the application of very simple coding where, as there are 8 
symbols, we adopt a 3-bit code. [link] shows that the entropy of such a 
source is 2.5524 bit/symbol and, with the fixed 3 bit/symbol length 
allocated codewords, the efficiency of this simple coder would be only 
2.5524/3.0 = 85.08%, which is a rather poor result. 


Huffman coding 


This is a variable length coding technique which involves two processes, 
reduction and splitting. 


Reduction 


We start by listing the symbols in descending order of probability, with the 
most probable symbol, C, at the top and the least probable symbol, H, at the 
foot, see left hand side of [link]. Next we reduce the two least probable 
symbols into a single symbol which has the combined probability of these 
two symbols summed together. Thus symbols H and D are combined into a 
single (i.e. reduced) symbol with probability 0.04 + 0.05 = 0.09. 


Now the symbols have to be reordered again in descending order of 
probability. As the probability of the new H+D combined symbol (0.09) is 
no longer the smallest value it then moves up the reordered list as shown in 
the second left column in [link]. 


This process is progressively repeated as shown in [link] until all symbols 
are combined into a single symbol whose probability must equal 1.00. 


C   0.40   0.40   0.40   0.40   0.40   0.40   0.60   1.00 
B   0.18   0.18   0.18   0.19   0.23   0.37   0.40 
A   0.10   0.10   0.13   0.18   0.19   0.23 
F   0.10   0.10   0.10   0.13   0.18 
G   0.07   0.09   0.10   0.10 
E   0.06   0.07   0.09 
D   0.05   0.06 
H   0.04 


Huffman coder reduction process 


Splitting 


The variable length codewords for each transmitted symbol are now derived 
by working backwards (from the right) through the tree structure created in 
[link], by assigning a 0 to the upper branch of each combining operation 
and a 1 to the lower branch. 


The final “combined symbol” of probability 1.00 is thus split into two parts: 
one of probability 0.60 with an assigned digit of 0 and another of probability 
0.40 with an assigned digit of 1. This latter part with probability 0.40 and 
assigned digit of 1 actually represents symbol C, [link]. 


The “combined symbol” with probability 0.60 (and allocated first digit of 0) 
is now split into two further parts: one with probability 0.37 and an additional 
or second assigned digit of 0 (i.e. its code is now 00) and another part with 
the remaining probability 0.23, where the additional assigned digit is 1 and 
the associated code will now be 01. 


Each probability is shown with its code on the line beneath it. New digits are added to the right. 

C   0.40    0.40    0.40    0.40    0.40    0.40    0.60    1.00 
    1       1       1       1       1       1       0 
B   0.18    0.18    0.18    0.19    0.23    0.37    0.40 
    001     001     001     000     01      00      1 
A   0.10    0.10    0.13    0.18    0.19    0.23 
    011     011     010     001     000     01 
F   0.10    0.10    0.10    0.13    0.18 
    0000    0000    011     010     001 
G   0.07    0.09    0.10    0.10 
    0100    0001    0000    011 
E   0.06    0.07    0.09 
    0101    0100    0001 
D   0.05    0.06 
    00010   0101 
H   0.04 
    00011 


Huffman coder splitting process to generate the variable length 
codewords and allocate these depending on symbol probabilities. 


This process is repeated by adding each new digit after the splitting 
operation to the right of the previous one. Note how this allocates short 
codes to the more probable symbols and longer codes to the less probable 
symbols, which are transmitted less often. 
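
The reduction and splitting steps can be sketched in a few lines. This is an illustrative implementation, assuming a priority queue in place of the repeated reordering of the list; the function name huffman_code is hypothetical. Where two probabilities tie (A and F here) the individual codewords may differ from those in the figure, but the codeword lengths remain optimal.

```python
import heapq
from itertools import count

PROBS = {"A": 0.10, "B": 0.18, "C": 0.40, "D": 0.05,
         "E": 0.06, "F": 0.10, "G": 0.07, "H": 0.04}

def huffman_code(probs):
    """Return {symbol: codeword} built by repeated reduction."""
    tie = count()  # unique tie-breaker so the dicts are never compared
    heap = [(p, next(tie), {sym: ""}) for sym, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Reduction: combine the two least probable entries.
        p_low, _, low = heapq.heappop(heap)
        p_high, _, high = heapq.heappop(heap)
        # Splitting: the more probable (upper) branch takes 0, the lower takes 1.
        merged = {s: "0" + c for s, c in high.items()}
        merged.update({s: "1" + c for s, c in low.items()})
        heapq.heappush(heap, (p_low + p_high, next(tie), merged))
    return heap[0][2]

codes = huffman_code(PROBS)
for symbol in sorted(codes):
    print(symbol, codes[symbol])
```

Prepending the digit at each merge is equivalent to appending digits on the right while working backwards from the final combined symbol, since the last merge supplies the first digit of every codeword.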


Symbol   Code 
A        011 
B        001 
C        1 
D        00010 
E        0101 
F        0000 
G        0100 
H        00011 

Huffman coded variable length symbols 


Code efficiency 


[link] summarises the codewords now allocated to each of the transmitted 
symbols A...H and also calculates the average length of this source coder 
as 2.61 bits/symbol. Note the considerable reduction from the fixed length 
of 3 in the simple 3-bit coder in the earlier table. 
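
The average length of a codeword for this code is: 

\[
\begin{aligned}
L = {} & 0.10 \times 3 + 0.18 \times 3 + 0.40 \times 1 + 0.05 \times 5 + 0.06 \times 4 \\
       & + 0.10 \times 4 + 0.07 \times 4 + 0.04 \times 5 = 2.61 \text{ binary digits/symbol}
\end{aligned}
\]

Symbol   Probability   Code    Length 
A        0.10          011     3 
B        0.18          001     3 
C        0.40          1       1 
D        0.05          00010   5 
E        0.06          0101    4 
F        0.10          0000    4 
G        0.07          0100    4 
H        0.04          00011   5 
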

Summary of allocated codewords for each symbol, A ...H, and 
calculation of average length of transmitted codeword. 


Now recall from [link] that the entropy of the source data was 2.5524 
bits/symbol and that the simple fixed length 3-bit code in the earlier table, 
with a length of 3.00, gave an efficiency of only 85.08%. 


The efficiency of the Huffman coded data with its variable length 
codewords is therefore 2.5524/2.61 = 97.8%, which is a much more 
acceptable result. 
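
The two efficiency figures can be reproduced as follows. This is a small sketch, assuming the probabilities and Huffman codewords listed in the tables of this module (B = 001 is taken from the splitting figure); the variable names are illustrative.

```python
from math import log2

PROBS = {"A": 0.10, "B": 0.18, "C": 0.40, "D": 0.05,
         "E": 0.06, "F": 0.10, "G": 0.07, "H": 0.04}
HUFFMAN = {"A": "011", "B": "001", "C": "1", "D": "00010",
           "E": "0101", "F": "0000", "G": "0100", "H": "00011"}

source_entropy = -sum(p * log2(p) for p in PROBS.values())
# Average codeword length: each length weighted by its symbol probability.
average_length = sum(PROBS[s] * len(HUFFMAN[s]) for s in PROBS)

efficiency_fixed = source_entropy / 3.0            # 3-bit fixed length code
efficiency_huffman = source_entropy / average_length

print(f"average length     = {average_length:.2f} binary digits/symbol")
print(f"fixed 3-bit coder  = {100 * efficiency_fixed:.2f}%")
print(f"Huffman coder      = {100 * efficiency_huffman:.2f}%")
```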


If the symbol probabilities are all integer powers of 1/2 (i.e. of the form 
1/2^m) then Huffman coding will result in 100% efficiency. 


Note: This module has been created from lecture notes originated by P M 
Grant and D G M Cruickshank which are published in I A Glover and P M 
Grant, "Digital Communications", Pearson Education, 2009, ISBN 
978-0-273-71830-7. Powerpoint slides plus end of chapter problem 
examples/solutions are available for instructor use via password access at 
http://www.see.ed.ac.uk/~pmg/DIGICOMMS/ 


Block FECC coding 

Introduces the concept of forward error correcting codes and in particular 
those based on block coding techniques. Includes parity check equations 
and the parity check matrix. 


Block FECC coding 


Forward error correcting coding (FECC) 


Block codes are one example of the forward error correcting coding 
(FECC) technique where we encode the signal by adding additional bits or 
digits of redundant data so that the decoder is then able to correct most of 
the errors which are introduced by transmission through a noisy channel. 
FECC was invented for deep space probes where the extremely long 
transmission propagation path loss results in received data with particularly 
low signal to noise ratio as the modest transmitter power is limited by the 
solar panel outputs. 


As we are appending the additional bits to the information bits to generate 
each codeword, this is a systematic encoder: the information data bits are 
included directly within the codewords. The additional bits required for the 
transmission of the redundant information increase the data rate, which will 
consume more bandwidth if we wish to maintain the same throughput but, 
if we seek to obtain low error rates, then this trade-off is usually acceptable. 


[Figure: bit error probability P_b (logarithmic scale) plotted against E_b/N_0 in dB, from 0 to 10 dB, for coded and uncoded transmission] 


Error probability against received noise level for FECC and uncoded 
data transmissions 


[link] shows the typical error rate performance for uncoded data compared 
with FECC data. It plots the bit error probability, P_b, against E_b/N_0, 
the ratio of the energy per bit to the noise power spectral density, which 
is the signal to noise ratio measure normally used on these error rate 
plots. 


FECC is used widely in compact discs (CD), computer storage and data 
transmission, all manner of radio links, data modems, video, TV and 
cellphone transmissions, space communications etc. Note in [link] the 
ability of FECC to achieve a much lower error rate than the uncoded 
data transmissions in the region of low bit error probability, P_b. 


ASCII coding 


In some computer communication systems, information is sent as 7-bit 
ASCII codes with a parity check bit added on the end. Using even parity the 
7-bit all zero ASCII code 0000000 expands into 00000000 while 0000001 
codes to 00000011. These two codewords differ in two binary digits, i.e. 
they have a Hamming distance of 2, and no pair of codewords differs in 
fewer than two digits. [link] shows that we can use this to detect 1 (or 
any odd number of) errors. 
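
The even parity scheme can be sketched as below. This is a minimal illustration, assuming the 7-bit words are handled as strings of "0" and "1"; the function names are hypothetical.

```python
def add_even_parity(bits7):
    """Append one parity bit so the 8-bit codeword has an even number of 1s."""
    if len(bits7) != 7 or set(bits7) - {"0", "1"}:
        raise ValueError("expected a string of 7 binary digits")
    return bits7 + str(bits7.count("1") % 2)

def parity_ok(word8):
    """True if the received 8-bit word still has even parity."""
    return word8.count("1") % 2 == 0

def flip(word, position):
    """Return the word with the bit at the given index inverted (models an error)."""
    inverted = "1" if word[position] == "0" else "0"
    return word[:position] + inverted + word[position + 1:]

codeword = add_even_parity("0000001")
single_error = flip(codeword, 3)
double_error = flip(single_error, 5)

print(codeword, parity_ok(codeword))
print(single_error, parity_ok(single_error))   # odd number of errors: detected
print(double_error, parity_ok(double_error))   # even number of errors: missed
```

The last case shows why a Hamming distance of 2 only supports detection of an odd number of errors: two errors turn one valid codeword into another.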


[Figure: even parity example in which the received codeword contains a single bit error; the parity check detects any odd number of errors] 


ASCII code example where the received codeword has a single bit error 


The block length is then n = 8 and the number of information bits k = 7. 
This assists with error detection but, with a coding rate of 7/8, the code 
has too little redundancy to provide any error correction capability. 


The minimum distance in binary digits between any two codewords is 
known as the minimum Hamming distance, D_min, which is 2 for the case 
of odd or even parity check in ASCII data transmission. We can then 
calculate the error detecting and correcting power of a code from the 
minimum distance in bits between error free blocks or codewords, see the 
error correction capability module. 


Although we shall look exclusively at coding schemes for binary systems, 
error correcting and detecting coding is not confined to binary digits. For 
example the ISBN numbers used on books have a checksum appended to 
them and these are calculated via modulo 11 arithmetic. 


Block code construction 


Block codes collect or arrange incoming information carrying data into 
groups of k binary digits and add coding (i.e. parity) bits to increase the 
coded block length up to n bits, where n>k. 


[Figure: k information digits enter the block coder, which outputs n encoded digits at rate R = k/n; each n digit codeword comprises the information digits followed by the parity digits] 


Block coder with k information digits and appended parity check bits 


The coding rate R is simply the ratio of data or information carrying bits to 
the overall block length, k/n. The number of parity check (redundant) bits is 
therefore n - k, [link]. This block code is usually described as an (n, k) code. 


Block code example 


Suppose we want to code k = 4 information bits into an n = 7 bit codeword, 
giving a coding rate of 4/7. Code design is performed by using finite field 
algebra to achieve linear codes. We can achieve this (7, 4) block code using 
3 input exclusive or (EX-OR) gates to form the three even parity check 
bits, P_1, P_2 and P_3, from the 4 information carrying bits, I_1 ... I_4, as 
shown in [link]. 


Logic gate representation for (n, k) block coder where k = 4 
information bits and n = 7 encoded block length (i.e. (7, 4) coder) 


This circuitry can be represented by the logic gates in [link] or written 
either as a set of parity check equations or the corresponding parity check 
matrix H, as in [link]. 


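\[
\begin{aligned}
P_1 &= 1 \times I_1 \oplus 0 \times I_2 \oplus 1 \times I_3 \oplus 1 \times I_4 \\
P_2 &= 1 \times I_1 \oplus 1 \times I_2 \oplus 0 \times I_3 \oplus 1 \times I_4 \\
P_3 &= 1 \times I_1 \oplus 1 \times I_2 \oplus 1 \times I_3 \oplus 0 \times I_4
\end{aligned}
\]

\[
H = \left[ \begin{array}{cccc|ccc}
1 & 0 & 1 & 1 & 1 & 0 & 0 \\
1 & 1 & 0 & 1 & 0 & 1 & 0 \\
1 & 1 & 1 & 0 & 0 & 0 & 1
\end{array} \right]
\]

The left hand four columns of H hold the parity check equations and the right hand three columns form an identity matrix. 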


Parity check bit computation and corresponding H matrix 
representation for (7, 4) block encoder 


Remember here that the “cross-in-the-circle” symbol indicates a bitwise 
exclusive-or (EX-OR) operation. This H matrix can also be used to 
directly generate codewords from the information bits via a closely related 
G matrix. 
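
The parity check equations translate directly into code. The sketch below assumes the three rows of H given by the parity check equations of this (7, 4) coder; the function names encode and syndrome are illustrative, and the syndrome check is a standard use of H that is included here only to show how a received word can be tested.

```python
# Rows of the parity check matrix H: four parity-equation columns,
# followed by a 3 x 3 identity matrix.
H = [
    [1, 0, 1, 1, 1, 0, 0],
    [1, 1, 0, 1, 0, 1, 0],
    [1, 1, 1, 0, 0, 0, 1],
]

def encode(info):
    """Systematic (7, 4) encoding: 4 information bits followed by P1, P2, P3."""
    parity = [sum(h * i for h, i in zip(row[:4], info)) % 2 for row in H]
    return list(info) + parity

def syndrome(word):
    """Modulo-2 product of H with a 7-bit word; all zeros for a valid codeword."""
    return [sum(h * w for h, w in zip(row, word)) % 2 for row in H]

codeword = encode([1, 0, 1, 1])
print(codeword, syndrome(codeword))

received = codeword.copy()
received[2] ^= 1                      # introduce a single bit error
print(received, syndrome(received))   # non-zero syndrome flags the error
```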


This is an example of a systematic code, where the data is included directly 
within the codeword. Convolutional FECC, see later module, is an example 
of a non-systematic coder where we do not explicitly include the 
information carrying data within the transmissions, although the transmitted 
coded data is derived from the information data. 




Block code performance 
This module explores with nearest neighbour decoding the limits or bounds 
on error correcting performance of the (n,k) block coder. 


Block code error correction capability 


Hamming distance 


Consider two distinct five digit codewords C1 = 00000 and C2 = 00011. 
These have a binary digit difference (or Hamming distance) of 2 in the last 
two digits. The minimum distance in binary digits between any two 
codewords is known as the minimum Hamming distance, D_min. 


For block codes the minimum Hamming distance or the smallest difference 
between the digits for any two codewords in the complete code set, D_min, 
is the property which controls the error correction performance. We can 
thus calculate the error detecting and correcting power of a code from the 
minimum distance in bits between the codewords. 


Thus a code with a minimum distance D_min = 3 can be used to correct 

\[ \frac{D_{min} - 1}{2} = \frac{3 - 1}{2} = 1 \text{ error} \]

or to detect 

\[ D_{min} - 1 = 3 - 1 = 2 \text{ errors} \]

but it cannot perform both of these functions simultaneously. 


Relationship between Dmin and error detection OR correction 
capability (but not both simultaneously) 
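
These relationships are easy to check numerically. The following sketch assumes codewords are held as equal-length strings of binary digits; the function names are illustrative, and the floor division used for even values of D_min is an assumption that goes beyond the D_min = 3 case worked in the text.

```python
from itertools import combinations

def hamming_distance(a, b):
    """Number of digit positions in which two equal-length words differ."""
    if len(a) != len(b):
        raise ValueError("words must have the same length")
    return sum(x != y for x, y in zip(a, b))

def minimum_distance(codewords):
    """D_min: the smallest Hamming distance over all pairs of codewords."""
    return min(hamming_distance(a, b) for a, b in combinations(codewords, 2))

def capability(d_min):
    """Errors detectable OR correctable (not both at once) for a given D_min."""
    return {"detect": d_min - 1, "correct": (d_min - 1) // 2}

print(hamming_distance("00000", "00011"))
print(capability(minimum_distance(["00000", "00111"])))
```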


Note in the earlier example of two five digit codewords C1 = 00000 and C2 
= 00011, which had a Hamming distance of 2, that only a single word 
(e.g. A = 00001 or B = 00010) lies in between these two codewords on any 
path from one to the other. Now if there was an error result (e.g. A = 00001) 
we cannot tell whether it came from C1 or C2, so we can only use this to 
detect that an error has occurred. 


If the two five digit codewords had been C1 = 00000 and C2 = 00111, 
which have a Hamming distance of 3, there are then two words which lie 
in between these codewords (e.g. A = 00001 and B = 00011) and these can 
thus be used EITHER to detect two errors without any correction capability 
OR, if detection is not required, they can be used to correct a single error 
(e.g. C1 = 00000 distorted into A = 00001 or C2 = 00111 distorted into B = 
00011), [link]. 


When performing the correction operation we need to insert the decision 
boundary as shown in Example 1 below. If in this example we had wished 
to perform detection only, as shown in the lower part of [link], then we 
would ignore whether the received word A = 00001 resulted from a single 
error from a C1 transmission or a double error from a C2 transmission 
and only identify it as a detected error. 


Example 1: error correction 

C1 = 00000 
A  = 00001 
---------- decision boundary ---------- 
B  = 00011 
C2 = 00111 


This explains further the detailed operation of the equations in [link] where 
detection only operation does not require the decision boundary to aid 
identification of the origination of the error. 


Block error probability and correction capability 


If we have an error correcting code which can correct R' errors, then the 
probability of a codeword not being correctable is the probability of having 
more than R' errors in n digits. The probability of having more than R' 
errors is given in [link]. We can thus calculate this probability by summing 
all the individual error probabilities up to and including R' errors in the 
block and subtracting the result from 1. 


More than R' errors in n binary digits? 


For a code which can correct R' errors, the probability of an uncorrectable 
error is the probability of having more than R' errors in n digits, as given by: 

\[ P(> R' \text{ errors}) = 1 - \sum_{j=0}^{R'} P(j \text{ errors}) \]


Correction of more than R' errors in an n digit block 


The probability of j errors occurring in an n digit codeword is given in 
[link]. P_e is the probability of error in a single binary digit and n is the 
block length. [link] also shows how to calculate the nCj term representing 
the number of ways, or error positions, in which j errors can occur 
within a block of length n binary digits. 


The probability of j errors in an n digit codeword is: 

\[ P(j \text{ errors}) = (P_e)^j (1 - P_e)^{n-j} \times {}^nC_j \]

P_e is the probability of error in a single binary digit and n is the block 
length. nCj is the number of ways of choosing j error digit positions within 
a block of length n binary digits: 

\[ {}^nC_j = \frac{n!}{j!\,(n-j)!} \]

where ! denotes the factorial operation. 


Probability of j errors occurring in an n-digit codeword 


Example 2: If we have an error correcting code which can correct 3 errors 
within a block length n of 10, what is the probability that the code cannot 
correct a received block if the per digit error probability is P_e = 0.01? 


Solution: The code cannot correct the received block if there are more than 
3 errors. Thus: 


P(> 3 errors) = 1 - P(0 errors) - P(1 error) - P(2 errors) - P(3 errors). 


[link] shows the component parts of this calculation. 


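\[
\begin{aligned}
P(0 \text{ errors}) &= 0.01^0 \times 0.99^{10} \times \frac{10!}{0!\,10!} = 0.9043821 \\
P(1 \text{ error}) &= 0.01^1 \times 0.99^{9} \times \frac{10!}{1!\,9!} = 0.0913517 \\
P(2 \text{ errors}) &= 0.01^2 \times 0.99^{8} \times \frac{10!}{2!\,8!} = 0.0041523 \\
P(3 \text{ errors}) &= 0.01^3 \times 0.99^{7} \times \frac{10!}{3!\,7!} = 0.0001118
\end{aligned}
\]


Calculation of zero, 1, 2 and 3 errors in a 10-digit codeword with a 
per-digit P_e of 0.01 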


Thus the probability that the code cannot correct a received block is then: 
1 - 0.9043821 - 0.0913517 - 0.0041523 - 0.0001118 = 0.0000021. 


This illustrates that the very low overall error probability remaining after 
correction of up to three errors is much less than the original probability of 
error in a single bit, P_e = 0.01. Note also the need for high precision 
arithmetic (an eight digit calculator may not be good enough to calculate the 
answer to more than 1 significant figure). Note also in [link] the much lower 
probability of t + 1 errors occurring, compared to t errors, as is assumed in FECC. 
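
Example 2 can be reproduced with double precision arithmetic, which sidesteps the calculator precision issue noted above. This is a short sketch, assuming Python 3.8 or later for math.comb; the function names are illustrative.

```python
from math import comb

def prob_j_errors(j, n, p_e):
    """Probability of exactly j errors in an n digit block (binomial term)."""
    return comb(n, j) * p_e**j * (1 - p_e)**(n - j)

def prob_uncorrectable(n, max_correctable, p_e):
    """Probability of more errors in the block than the code can correct."""
    correctable = sum(prob_j_errors(j, n, p_e) for j in range(max_correctable + 1))
    return 1.0 - correctable

for j in range(4):
    print(f"P({j} errors) = {prob_j_errors(j, 10, 0.01):.7f}")

p_fail = prob_uncorrectable(n=10, max_correctable=3, p_e=0.01)
print(f"P(> 3 errors) = {p_fail:.7f}")
```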


Group codes 


Group codes are a special kind of block code. They comprise a set of 
codewords, C1 ... CN, which contains the all zeros codeword (e.g. 00000) 
and exhibits a special property called closure. This property means that if 
any two valid codewords are subject to a bitwise EX-OR operation then 
they will produce another valid codeword in the set. 


The closure property means that to find the minimum Hamming distance, 
see below, all that is required is to compare all the remaining codewords in 
the set with the all zeros codeword instead of comparing all the possible 
pairs of codewords. 


The saving gets bigger the longer the codeword. For example a code set 
with 100 codewords will require 99 comparisons (each non-zero codeword 
against the all zeros codeword) for a group code design, compared with 
99 + 98 + ... + 2 + 1 = 4950 comparisons for a non-group code! 


In group codes the D_min calculation is further simplified into calculating 
the minimum codeword weight, i.e. the minimum number of 1 digits in any 
non-zero codeword in the set. 
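
The closure property and the weight shortcut can be illustrated as follows. The four-codeword set used here is an assumption made for illustration: 00000, 11100 and 11011 are quoted in the decoding example later in this module, and 00111 follows from them by closure. The function names are hypothetical.

```python
from itertools import combinations

# Assumed illustrative (5, 2) group code.
CODEWORDS = ["00000", "11100", "00111", "11011"]

def xor_words(a, b):
    """Bitwise EX-OR of two equal-length binary strings."""
    return "".join("1" if x != y else "0" for x, y in zip(a, b))

def is_group_code(codewords):
    """True if the set holds the all zeros word and is closed under EX-OR."""
    words = set(codewords)
    zero = "0" * len(codewords[0])
    return zero in words and all(
        xor_words(a, b) in words for a, b in combinations(codewords, 2))

def min_weight(codewords):
    """Smallest number of 1 digits in any non-zero codeword."""
    return min(w.count("1") for w in codewords if "1" in w)

def min_distance_pairwise(codewords):
    """D_min found the long way, by comparing every pair of codewords."""
    return min(xor_words(a, b).count("1") for a, b in combinations(codewords, 2))

print(is_group_code(CODEWORDS))
print(min_weight(CODEWORDS), min_distance_pairwise(CODEWORDS))
```

Both routes give the same D_min, but the weight calculation inspects each codeword once rather than every pair.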


Nearest neighbour decoding 


Nearest neighbour decoding assumes that the codeword nearest in 
Hamming distance to the received word is what was transmitted, as shown 
in Example 1 above. This inherently contains the assumption that the 
probability of t errors is greater than the probability of t + 1 errors, 
i.e. that P_e is small. 


A nearest neighbour decoding table for an (n, k) = (5, 2) code, i.e. a 5-digit 
group code, is shown in [link]. Recall that for an n = 5 bit codeword there 
are 2^5 = 32 unique patterns generated by all the possible combinations of 
the 5 digits. 


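Codewords               00000   11100   00111   11011 

Single bit errors       10000   01100   10111   01011 
(correctable)           01000   10100   01111   10011 
                        00100   11000   00011   11111 
                        00010   11110   00101   11001 
                        00001   11101   00110   11010 

Double bit errors       10001   01101   10110   01010 
(detectable but not     10010   01110   10101   01001 
correctable) 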


Nearest neighbour decoding table for 5 bit code with 4 codewords 
implying 2 information bits 


[link] starts by forming a table with the 4 codewords across the top row. All 
the single error patterns, which each only differ by one bit from each of the 
transmitted codewords, can be readily and uniquely assigned back to an 
error free codeword. Thus the next 5 rows represent these single errors in 
position 1 through 5 in each of the 4 codewords. Now we have a table up to 
this point with a total of 4 x 6 = 24 unique entries. Therefore this code is 
capable of correcting all these single errors. 


There are also eight remaining words or table entries, as 32 - 24 = 8, and 
these represent double error patterns which, as can be seen, lie an equal 
Hamming distance from at least 2 of the initial 4 codewords in the top row. 
Note for example that errors in the first two digits of the 00000 codeword 
result in us receiving 11000. However this bit pattern is identified here in 
[link] as a single error from codeword 11100, as we assume that 1 error is a 
much more likely occurrence than two errors! 


The eight remaining entries represent some of the double error patterns, 
which can thus be detected here, but they cannot be corrected as all the 
possible double error patterns do not have a unique representation in [link]. 
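
The decoding table can be checked by brute force over all 32 received words. This is a hedged sketch: the codeword set is assumed (00000, 11100 and 11011 appear in the text, and 00111 follows by closure), and the function names are illustrative. A tie between two equally near codewords is reported as None, which corresponds to a detected but uncorrectable error.

```python
from itertools import product

# Assumed (5, 2) group code for the decoding table.
CODEWORDS = ["00000", "11100", "00111", "11011"]

def distance(a, b):
    """Hamming distance between two equal-length binary strings."""
    return sum(x != y for x, y in zip(a, b))

def nearest_neighbour(received):
    """Return the unique nearest codeword, or None if two are equally near."""
    ranked = sorted(CODEWORDS, key=lambda c: distance(c, received))
    if distance(ranked[0], received) == distance(ranked[1], received):
        return None
    return ranked[0]

all_words = ["".join(bits) for bits in product("01", repeat=5)]
decodable = [w for w in all_words if nearest_neighbour(w) is not None]
ambiguous = [w for w in all_words if nearest_neighbour(w) is None]

print(nearest_neighbour("11000"))
print(len(decodable), len(ambiguous))
```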


Soft decision decoding 


Nearest neighbour decoding can also be done on a soft decision basis, with 
real non-binary numbers from the receiver. The nearest Euclidean distance 
(nearest to these 4 codewords in terms of a 5-D geometry) is then used and 
this gives a considerable performance increase over the hard decision 
decoding described here. 


Hamming bound 


This defines mathematically the error correcting performance of a block 
coder. The upper bound on the performance of block codes is given by the 
Hamming bound, sometimes called the sphere packing bound. If we are 
trying to create a code to correct t errors with a block length of n with k 
information digits, then [link] shows the Hamming bound equation. 


The upper bound on the performance of block codes is given by the 
Hamming bound: 

\[ 2^k \le \frac{2^n}{1 + n + {}^nC_2 + {}^nC_3 + \dots + {}^nC_t} \]

nCj is the number of ways of choosing j error positions within a block of 
length n binary digits: 

\[ {}^nC_j = \frac{n!}{j!\,(n-j)!} \]


Hamming bound calculation for (n, k) block code to establish number 
of terms which can be included in the denominator and hence arrive at 
the code's error correcting power t 


Here the denominator terms, which are represented by the binomial 
coefficients, represent the number of possible patterns or positions in which 
1, 2, ..., t errors can occur in an n-bit codeword. 


Note the relationship between the decoding table in [link] and the Hamming 
bound equation in [link]. The 2^k = 4 left hand entry represents the number 
of transmitted codewords or columns in the table. The numerator 2^n = 32 
represents the total possible number of unique entries in the table. The 
denominator represents the number of rows which can be accommodated 
within the table. Here the first denominator term (1) represents the first row 
(i.e. the transmitted codewords) and the second term (n) the 5 single error 
patterns. Subsequent terms then represent all the possible double, triple 
error patterns, etc. The number of denominator terms has to be restricted so 
that the inequality still holds, and the last term that can be included defines 
the error correction capability t. 


If the equation in [link] is satisfied then the design of such an (n, k) code is 
possible with the error correcting power of t. If the equation is not satisfied, 
then we must be less ambitious by reducing t or k (for the same block 
length n) or increasing n (while maintaining t and k). 


Example 2 


Comment on the ability of a (5, 2) code to correct 1 error and the possibility 
of a (5, 2) code to correct 2 errors? 


Solution 


For single error: k = 2,n = 5 and t = 1, leads to the case summarized in 
[link]. 


Can a (5,2) code correct a single error? 


k = 2, n = 5 and t = 1 leads to: 

\[ 2^2 \le \frac{2^5}{1 + 5} = \frac{32}{6} \quad \text{or} \quad 4 \le 5.333 \]


Calculation to assess whether (5, 2) block code can correct t = 1 error - 
Answer yes 


which is true so such a code design is possible. 


However if we try to design a (5, 2) code to correct 2 errors we have k = 2, 
n = 5 and t = 2, which is summarized in [link]. 


Attempt to design a (5,2) code which corrects 2 errors 


We have k = 2, n = 5 and t = 2 and in the Hamming bound: 

\[ 2^2 \le \frac{2^5}{1 + 5 + 10} = \frac{32}{16} \quad \text{or} \quad 4 \le 2 \]

which is false so such a code cannot be created. 


Calculation to assess whether (5, 2) block code can correct t = 2 errors 
- Answer no 


This result is false or cannot be satisfied and thus this short code cannot be 
designed with a t = 2 error correcting power or capability. 


This provides further mathematical derivation for the error correcting 
performance limits of the nearest neighbour decoding table shown 
previously in [link] where we could correct all single error patterns but we 
could not correct all the possible double error patterns. 


A full decoding table is not required to be created as, through checking the 
Hamming bound, one can identify the required block size and number of 
parity check bits which are required for a given error correction capability 
in a block or group coder design. 
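
The bound check is simple to automate. The sketch below assumes the inequality is tested in integer form (multiplying through by the denominator) to avoid rounding; the function names are illustrative. Note that the bound is a counting argument only: failing it rules a code out, while passing it does not by itself construct the code.

```python
from math import comb

def satisfies_hamming_bound(n, k, t):
    """Test 2^k <= 2^n / (1 + n + nC2 + ... + nCt) using integers only."""
    error_patterns = sum(comb(n, j) for j in range(t + 1))
    return 2**k * error_patterns <= 2**n

def max_t_allowed(n, k):
    """Largest t for which the Hamming bound still holds for an (n, k) code."""
    t = 0
    while t < n and satisfies_hamming_bound(n, k, t + 1):
        t += 1
    return t

print(satisfies_hamming_bound(5, 2, 1))
print(satisfies_hamming_bound(5, 2, 2))
print(max_t_allowed(5, 2))
```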


[link] shows the performance of various block codes, all of rate 1/2, 
whose performance progressively improves as the block length increases 
from 7 to 511, even for the same coding rate of 1/2. 


The power of these forward error correcting codes (FECC) is quantified as 
the coding gain, i.e. the reduction in the required E_b/N_0 ratio (the energy 
required to transmit each bit divided by the noise spectral density) for a 
given bit error ratio or error probability. 


For example in [link] the (31, 16) code has a coding gain over the uncoded 
case of around 1.8 dB at a P_b of 10°°. 


Error performance of 1/2 rate block coders with differing block lengths 




Convolutional FECC Encoder 

Convolutional codes form part of forward error correcting coders (FECC). 
They are non-systematic and are generated by passing a data sequence 
through a transversal or finite impulse response (FIR) filter. This module 
provides an example of encoding data with a simple convolutional encoder. 


FECC: 1/2 Rate Convolutional Encoder Example 


Convolutional coding 


Convolutional codes are another type of forward error correcting coder 
(FECC) which are quite distinct from block codes. They are simpler to 
implement for longer codes than block coders and soft decision decoding 
can be employed easily at the decoder. 


Convolutional codes are non-systematic (i.e. the transmitted data bits do not 
appear directly in the output encoded data stream) and are generated by 
passing a data sequence through a transversal or finite impulse response 
(FIR) filter. The coder output can be regarded as the convolution of the 
input sequence with the impulse response of the coder, hence their name: 
convolutional codes. 


Convolutional encoder 


A simple example is shown in [link]. Here the encoder shift register starts 
with zeros at all three stored locations (i.e. 0, 0, 0). The input data sequence 
to be encoded is 1, 1, 0, 1 in this example. The shift register contents thus 
become, after each data bit arrives and propagates into the shift register: 
100, 110, 011, 101. As there are two outputs for every input bit the above 
encoder is rate 1/2. 


The first output is obtained after arrival of a new data bit into the shift 
register when the switch is in the upper position, the second with the switch 
in the lower position. Thus, in this example, the encoder will generate, 
through the exclusive OR gates, from the four input data bits 1, 1, 0, 1, the 
corresponding four output digit pairs: 11, 10, 11, 01. 


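[Figure: three stage shift register with two exclusive OR gates feeding a switch that selects the 1st and 2nd outputs in turn] 

Input                     1     1     0     1 
Output                    11    10    11    01 
Shift register contents   100   110   011   101 


1/2 rate convolutional encoder 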


This particular encoder has 3 stages in “the filter” and therefore we say that 
the constraint length n = 3. The very latest encoders available commercially 
typically have constraint lengths up to n = 9. 


We can consider the coder outputs from the exclusive OR gates as being 
generated by two polynomials: 
Equation: 

\[ P_1(x) = 1 + x^2 \]

Equation: 

\[ P_2(x) = 1 + x \]

These are often expressed in octal notation, in our example: 
Equation: 

\[ P_1 = 5_8 \; (101) \]

Equation: 

\[ P_2 = 6_8 \; (110) \]
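
The encoder described above can be sketched directly from the two generator polynomials. This is a minimal illustration, assuming the tap patterns 101 and 110 apply to the three shift register stages in order, with the newest bit in the first stage; the names G1, G2 and conv_encode are hypothetical.

```python
G1 = (1, 0, 1)  # P1(x) = 1 + x^2, octal 5
G2 = (1, 1, 0)  # P2(x) = 1 + x,   octal 6

def conv_encode(bits):
    """Rate 1/2 convolutional encoding; returns one output pair per input bit."""
    register = [0, 0, 0]                      # encoder starts in the all zeros state
    pairs = []
    for bit in bits:
        register = [bit] + register[:2]       # new bit shifts in, oldest bit is lost
        out1 = sum(g * r for g, r in zip(G1, register)) % 2   # exclusive OR of tapped stages
        out2 = sum(g * r for g, r in zip(G2, register)) % 2
        pairs.append((out1, out2))
    return pairs

print(conv_encode([1, 1, 0, 1]))
```

For the input 1, 1, 0, 1 this yields the pairs 11, 10, 11, 01 quoted in the text.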


This encoder may also be regarded as a state machine. The next state is 
determined by the next input bit or value combined with the previous two 
input bits or values which were stored in the shift register (i.e. the previous 
state). 


Tree state diagram 


We can regard this as a Mealy state machine with four states corresponding 
to all the possible combinations of the first two stages in the shift register. 


The tree diagram for this state machine is now shown in [link], again 
starting from the all zeros state or condition. The encoder starts in state A 
holding two zeros (00) within the first two stages of the shift register. (We 
ignore the final stored digit as it is lost when a new data bit propagates into 
the shift register.) If the next input bit is a zero (0) we follow the upper path 
to state B where the stored data is updated to 00. If the next input bit is a 
one (1) we follow the lower path to progress to the corresponding state C 
where the stored data is now 10. 


The convention is to enter the updated new stored state values below the 
state letter (B/C). Now returning to [link] and the exclusive OR gate 
connections one can derive the output data bits generated within the 
encoder. For state B these are 00 and for state C these are 11. These outputs 
are entered alongside the state in [link]. States B/C correspond to the arrival 
of the first new data bit to be encoded, while D/E/F/G correspond to the 
second data bit and H/I/J/K/h/i/j/k the third data bit. 


Encoded data tree diagram for the encoder of Figure 1 


The tree diagram in [link] tends to suggest that there are eight states in the 
last layer of the tree and that this will continue to grow. However some 
states in the last layer (i.e. the stored data in the encoder) are equivalent as 
indicated by the same letter on the tree (for example H and h). 


These pairs of states may be assumed to be equivalent because they have 
the same internal state for the first two stages of the shift register and 
therefore will behave exactly the same way to the receipt of a new (0 or 1) 
input data bit. 


Trellis state diagram 


Thus the tree can be folded into a trellis, as shown in [link], which is derived 
from the tree diagram of [link] and the [link] encoder. As the constraint length 
is n = 3 we have 2^(n-1) = 4 unique states: 00, 01, 10, 11 in [link]. In [link] 
the states are shown as 00x to denote the third bit, x, which is lost or 
discarded following the arrival of a new data bit. 


Trellis Diagram corresponding to the Tree Diagram of Figure 2 


Note in [link] the horizontal arrangement of states A, B, D, H and L. The 
same applies to states C, E, I and M etc. The horizontal direction 
corresponds to time (the whole diagram in [link] now corresponds to 
encoding 4 input data bits). Here we have dropped the state information 
from [link] as the same states are all represented at the same horizontal 
level in [link]. The vertical direction here corresponds to the stored state 
values a, b, c, d in the encoder shift register. 


States along the time axis are thus equivalent, for example H is equivalent 
to L and C is equivalent to E etc. In fact all the states in a horizontal line are 
equivalent. Thus we can identify only four states in this coder: a, b, c and d 
and the related shift register stored values 00, 10, 01, 11 are shown in the 
left hand side of [link]. 


From any point, e.g. E, if the next input bit is a zero (0) we follow the upper 
path to state J where the stored data is updated to 01 and the output will be 
01. If the next input bit is a one (1) we follow the lower path from E to 
progress to the next state K where the stored data is now 11 and the output 
will be 10 as indicated alongside the trellis path. 


Transition state diagram 


If desired, we can redraw the trellis diagram of [link] as the state diagram 
of [link], which contains only these four states. Each transition is labelled 
with the new data bit to be encoded followed, in brackets, by the two output 
bits generated for that input data bit (e.g. 1(10)). 


Input (output 1 output 2) 


State diagram corresponding to the encoder trellis diagram of Figure 3 
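The state diagram can also be tabulated programmatically. The sketch below is a minimal illustration under an assumption: the tap connections (output 1 = input XOR oldest stored bit, output 2 = input XOR both stored bits) are not given in this section and are chosen here because they reproduce the branch outputs quoted in the text (00 and 11 from state 00; 01 and 10 from state 10). The names step, STATES and state_diagram are hypothetical.

```python
def step(state, bit):
    """One encoder transition: returns (next_state, (output1, output2))."""
    s1, s2 = state            # s1 = most recent stored bit, s2 = older stored bit
    out1 = bit ^ s2           # assumed taps 101 (not confirmed by the text)
    out2 = bit ^ s1 ^ s2      # assumed taps 111 (not confirmed by the text)
    return (bit, s1), (out1, out2)

# States a, b, c, d with stored values 00, 10, 01, 11 as listed in the text.
STATES = [(0, 0), (1, 0), (0, 1), (1, 1)]

def state_diagram():
    """Map (state, input_bit) to (next_state, output_pair) for every edge."""
    return {(state, bit): step(state, bit) for state in STATES for bit in (0, 1)}

for (state, bit), (nxt, out) in state_diagram().items():
    # Printed in the figure's own notation: input(output1 output2)
    print(f"{state[0]}{state[1]} --{bit}({out[0]}{out[1]})--> {nxt[0]}{nxt[1]}")
```

Each of the four states has exactly two outgoing edges, one per input bit, which is why the growing tree can be replaced by a fixed diagram.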


Note: This module has been created from lecture notes originated by P M 
Grant and D G M Cruickshank which are published in I A Glover and P M 
Grant, "Digital Communications", Pearson Education, 2009, 
ISBN 978-0-273-71830-7. PowerPoint slides plus end of chapter problem 
examples/solutions are available for instructor use via password access at 
http://www.see.ed.ac.uk/~pmg/DIGICOMMS/ 


Viterbi Decoder 
Explains in simple terms the functions of a convolutional encoder and 
Viterbi decoder 


Viterbi convolutional decoder 


A convolutional code is not decoded in short blocks as in a block code. 
However, to simplify decoding, messages are artificially broken down into 
very long blocks by periodically flushing the encoder with a string of zeros, 
as in the example discussed here. 


For illustration only, this example uses an unrealistically short block 
length of 5 data bits, with the last two fixed at 0 to flush the encoder. 
This is very inefficient; in practice block lengths are very much longer, 
typically 1,000 to 10,000 bits. 
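The effect of flushing can be seen in a small sketch. It assumes the same hypothetical tap connections as elsewhere in this example (output 1 = input XOR oldest stored bit, output 2 = input XOR both stored bits), which are consistent with the branch outputs quoted in the text but are not stated in this section. The function names are illustrative only.

```python
def step(state, bit):
    """One encoder transition: returns (next_state, (output1, output2))."""
    s1, s2 = state                                   # newest and older stored bits
    return (bit, s1), (bit ^ s2, bit ^ s1 ^ s2)      # assumed tap connections

def encode_with_flush(message, memory=2):
    """Encode the message followed by `memory` zeros; return (bit pairs, final state)."""
    state = (0, 0)
    pairs = []
    for bit in list(message) + [0] * memory:
        state, out = step(state, bit)
        pairs.append(out)
    return pairs, state

pairs, final_state = encode_with_flush([1, 1, 1])
print(pairs, final_state)
```

Whatever the three data bits are, the two appended zeros return the register to state 00, so only 3 of the 5 encoded positions carry information in this short block.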


Convolutional codes are commonly decoded using the Viterbi algorithm, as this 
simplifies the decoding operation. The algorithm is based on the nearest 
neighbour decoding scheme and, like the other algorithms we have looked 
at, it relies on the assumption that the probability of t errors is much greater 
than the probability of t+1 errors. It thus retains only the paths which have 
fewer errors. 


The decoding process is based on the previous trellis. We will use 
the previous 1/2 rate encoder example and assume that the received message 
is: 10 10 00 10 10, representing a total of five (unknown) transmitted data 
bits, each encoded into a bit pair, i.e. five bit pairs or ten encoded bits in 
total. We further assume in this simplified example that the last 2 of the 5 
data inputs were flushing zeros to reset the encoder and decoder. 


Starting (after flushing) with the first received bit pair at state A, we know 
from the encoder figure that if a 1 had been input (lower path) the output 
should have been 11 as we moved to state C. If a 0 had been input (upper 
path) we should have received 00 and moved to state B, see the upper part 
of [link]. 


What was actually received was 10, a Hamming distance of 1 from both 
these possibilities, so we draw that in the lower part of [link] onto the first 
stage of our decoding trellis. 
First stage of trellis after decoding first two received data bits 


Instead of reporting the expected outputs, we next annotate the lower part of 
[link] with the separate distances between the received data and the 
encoder output on each path of the trellis. We then write the cumulative 
Hamming distance in square brackets above each of the states B and C. 


Now consider the second pair of received data bits. Consider first state B. 
As before, we should have received 00 for a 0 input and 11 for a 1 input, 
see left hand side of [link]. What we actually received was 10, which is a 
Hamming distance of 1 from both possibilities, so the right hand part of 
[link] is annotated with the individual and cumulative distances to states D 
and E. 


Then consider state C. For a 0 input (upper path) we should have received 
01, but what was actually received was 10, a Hamming distance of 2. For a 
1 input (lower path) we should have received 10 and this is exactly what 
was received, corresponding to a Hamming distance of 0! Again the right 
part of [link] is annotated with the individual distances on the paths and the 
new cumulative or summed distances to states F and G. 
First and second stage of the decoding trellis after receiving second 
pair of data bits 
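The branch and cumulative distances for these first two stages can be reproduced with a few lines of code. The expected outputs on each branch are taken directly from the text above; the table layout and names are illustrative assumptions.

```python
def hamming(a, b):
    """Number of positions in which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))

received = ["10", "10"]   # first two received bit pairs

# Branches as (from_state, to_state, input_bit, expected_output), from the text.
stage1 = [("A", "B", 0, "00"), ("A", "C", 1, "11")]
stage2 = [("B", "D", 0, "00"), ("B", "E", 1, "11"),
          ("C", "F", 0, "01"), ("C", "G", 1, "10")]

cumulative = {"A": 0}
for pair, branches in zip(received, (stage1, stage2)):
    for start, end, _bit, expected in branches:
        # Cumulative distance = distance so far + distance on this branch.
        cumulative[end] = cumulative[start] + hamming(pair, expected)

print(cumulative)
```

State G ends with the smallest cumulative distance, since its incoming branch matched the received pair exactly.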


We continue to build our decoding trellis until it is complete after receipt of 
all ten data bits, as shown in [link]. 


If we have two paths to a state, as in the later states H, I, J, K, L, M, N, P, 
we write the smaller (more likely) Hamming distance in square brackets 
above the state and discard the larger distance (as this is much less likely to 
represent the correct path). In our example, we assumed the last two bits 
were 0, so we must expect to finish back in state P, which is the same as the 
starting state A. 


We finally need to find the path from state A to P which gives the lowest 
overall Hamming distance. We then retrace the path and remember that the 
upper path from a state represented a 0 transmitted and the lower path 
represented a 1 transmitted. 


Decoding trellis for 10 10 00 10 10 received. 


Full decoding trellis after receipt of all ten data bits 


The reverse decoded data for this example is indicated by the dashed line in 
[link]. 


The path through states A, C, G and K always takes the lower of the two 
possible branches, which implies that a data bit 1 was transmitted at each 
step, and therefore this translates to 1, 1, 1 as the first three decoded data bits. 


The last two bits don’t matter in this case as we have assumed they are 0, 0 
and we can remove from the decoding trellis all the states that don’t support 
or contribute to this solution. 
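The whole procedure can be sketched as a short hard-decision Viterbi decoder. This is a minimal illustration, not the authors' implementation: the tap connections in step are an assumption (they reproduce the branch outputs quoted in the text but are not given in this section), and instead of retracing the trellis afterwards each survivor simply carries its decoded bits with it, which gives the same result.

```python
def step(state, bit):
    """One encoder transition: returns (next_state, (output1, output2))."""
    s1, s2 = state                                   # newest and older stored bits
    return (bit, s1), (bit ^ s2, bit ^ s1 ^ s2)      # assumed tap connections

def hamming(a, b):
    """Number of positions in which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def viterbi_decode(pairs, flush_bits=2):
    """Return (lowest cumulative distance, decoded bits) for a flushed block."""
    survivors = {(0, 0): (0, [])}      # state -> (cumulative distance, bits so far)
    for index, received in enumerate(pairs):
        # The last flush_bits inputs are known to be 0.
        allowed = (0,) if index >= len(pairs) - flush_bits else (0, 1)
        candidates = {}
        for state, (dist, bits) in survivors.items():
            for bit in allowed:
                nxt, out = step(state, bit)
                total = dist + hamming(out, received)
                # Keep only the smaller distance when two paths meet at a state.
                if nxt not in candidates or total < candidates[nxt][0]:
                    candidates[nxt] = (total, bits + [bit])
        survivors = candidates
    return survivors[(0, 0)]           # flushing forces the final state to be 00

received = [(1, 0), (1, 0), (0, 0), (1, 0), (1, 0)]
distance, bits = viterbi_decode(received)
print(bits, distance)
```

Under the assumed taps the surviving path decodes to 1, 1, 1 followed by the two flushing zeros, in agreement with the dashed path described above.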


Note that finishing a block with n-1 zero input data bits is not compulsory. 
If you make a decision after a delay of approximately five times the 
constraint length n, this makes little difference in code performance but 
does limit the memory consumed by the process to a more sensible amount. 


[link] shows the performance of various block codes, all of rate 1/2, 
whose performance improves as the block length increases, even for the 
same coding rate of 1/2. 


The power of these forward error correcting codes (FECC) is quantified as 
the coding gain, i.e. the reduction in the required $E_b/N_0$ ratio (the energy 
required to transmit each bit divided by the spectral noise density) for a 
given bit error ratio or error probability. 
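Written as a formula, this definition amounts to a difference of two $E_b/N_0$ values expressed in decibels, both read off at the same target error probability. The symbol $G$ and the subscripts "uncoded" and "coded" are notation assumed here for illustration and do not appear in the original notes.

```latex
G_{\mathrm{dB}}
  = \left.\frac{E_b}{N_0}\right|_{\mathrm{uncoded},\,\mathrm{dB}}
  - \left.\frac{E_b}{N_0}\right|_{\mathrm{coded},\,\mathrm{dB}}
  \qquad \text{at a fixed bit error probability } P_b
```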


For example in [link] the (31, 16) code has a coding gain over the uncoded 
case of around 1.8 dB at a PB, of 10°°. 
Error performance of 1/2 rate block coders with differing block lengths 


[link] shows for comparison with the block codes of [link] the performance 
of convolutional coders. The convolutional code initially provides very 
good performance at modest constraint length. A short constraint length of 
n= v= 3 is already superior to the 511 block length code of [link]. The 
additional attraction of the convolutional coder is its further improvement 
with the increase in constraint length up to n = 7 or 9, as shown in [link]. 


Unfortunately the coding and decoding process gets more complicated with 
larger block/constraint length. As shown here convolutional codes with 
Viterbi decoding are generally more powerful than block codes, especially 
for very low error rates, hence their wider use. Single chip constraint length 
9 (512 state) encoders and decoders are now widely available as commercial 
products from many semiconductor vendors. 
Error rate performance of convolutional decoders with differing 
constraint lengths 




Turbo Coding 
This module provides a brief extension of Viterbi convolutional decoders to 
turbo decoding. 


Turbo encoding and decoding 


Introduction 


A paper published by Claude Berrou and coauthors at the ICC conference in 
1993 shook the field of forward error correction coding (FECC). It described 
a method of creating much more powerful block error correcting codes with 
only a minimal amount of additional effort. Its main features were two 
recursive convolutional encoders (RCE) interconnected via an interleaver. 
The data is fed into the first encoder directly and into the second encoder 
after interleaving, or reordering, of the input data. 


Turbo encoding 


The important features are the use of two recursive convolutional encoders 
and the design of the interleaver which gives a block code with the block 
size equal to the interleaver size, [link]. Random interleavers tend to work 
better than row and column interleavers. Note that recursive convolutional 
encoders were known about well before their use in turbo codes, but the 
difficulties in driving them into a known state made them less popular than 
the non-recursive convolutional encoders described in the previous module. 
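A random interleaver of the kind mentioned above is simply a fixed pseudo-random permutation of the block, known to both encoder and decoder. The sketch below illustrates the idea only; the function names, the seed value and the 8-bit block are arbitrary assumptions, and practical turbo codes use far larger interleavers.

```python
import random

def make_interleaver(block_size, seed=0):
    """Return a fixed pseudo-random permutation of range(block_size)."""
    order = list(range(block_size))
    random.Random(seed).shuffle(order)
    return order

def interleave(block, order):
    """Reorder the block so that output position p holds block[order[p]]."""
    return [block[i] for i in order]

def deinterleave(block, order):
    """Undo interleave() using the same permutation."""
    restored = [None] * len(order)
    for position, i in enumerate(order):
        restored[i] = block[position]
    return restored

data = [1, 0, 1, 1, 0, 0, 1, 0]
order = make_interleaver(len(data), seed=1993)
shuffled = interleave(data, order)
print(shuffled, deinterleave(shuffled, order))
```

Because the block size of the code equals the interleaver size, the permutation length here plays the role of the block length.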


The name turbo decoder came from the turbocharger in an automobile, 
where the exhaust gases are used to drive a compressor in a feedback loop 
to increase the input of fuel and hence the vehicle's ultimate performance. 


Figure label: take every second bit from each encoder (puncturing) 


Turbo encoder with recursive encoding loops 


The desired output rate was initially achieved by puncturing, i.e. ignoring 
every second output bit from each of the encoders. 
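Puncturing can be sketched as follows. The sketch assumes that every data bit is transmitted and that the parity bits are taken alternately from the two encoders, starting with the first; the text only says that every second output of each encoder is ignored, so the choice of which encoder supplies the even positions is an assumption, as are the function and argument names.

```python
def puncture(data, parity1, parity2):
    """Multiplex data bits with alternately selected parity bits (rate 1/2)."""
    out = []
    for k, d in enumerate(data):
        # Even positions keep encoder 1's parity bit, odd positions encoder 2's.
        parity = parity1[k] if k % 2 == 0 else parity2[k]
        out.extend([d, parity])
    return out

data = [1, 0, 1, 1]
parity1 = [1, 1, 0, 0]   # illustrative parity stream from the first encoder
parity2 = [0, 1, 1, 0]   # illustrative parity stream from the second encoder
print(puncture(data, parity1, parity2))
```

Four data bits produce eight transmitted bits, so the overall rate is 1/2 rather than the 1/3 that would result from sending both parity streams in full.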


Turbo decoding 


Turbo decoding is iterative. The decoding is also soft: the values that flow 
around the whole decoder are real values and not binary representations 
(with the exception of the hard decisions taken at the end of the number of 
iterations you are prepared to perform). They are usually log likelihood 
ratios (LLRs), the log of the probability that a particular bit was a logic 1 
divided by the probability that the same bit was a logic 0. 
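The LLR definition and the final hard decision can be written down directly. This is a minimal sketch; the function names are illustrative, and the natural logarithm is assumed since the text does not specify a base.

```python
import math

def llr(p_one):
    """Log likelihood ratio: log of P(bit = 1) divided by P(bit = 0)."""
    return math.log(p_one / (1.0 - p_one))

def hard_decision(value):
    """Final hard decision: a positive LLR means logic 1 is the more likely."""
    return 1 if value > 0 else 0

for p in (0.1, 0.5, 0.9):
    print(p, llr(p), hard_decision(llr(p)))
```

The sign of the LLR gives the decision and its magnitude gives the confidence, which is the soft information passed between the decoders on each iteration.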


Decoding is accomplished by first demultiplexing the incoming data stream 
into $d$, $y_1$ and $y_2$. $d$ and $y_1$ go into the decoder for the first code, 
[link]. This gives an estimate of the extrinsic information from the first 
decoder, which is interleaved and passed on to the second decoder. The 
second decoder thus has three inputs: the extrinsic information from the 
first decoder, the interleaved data $d$, and the received values for $y_2$. It 
produces its own extrinsic information, which is deinterleaved and passed 
back to the first decoder. This process is then repeated or iterated as required 
until the final solution is obtained from the deinterleaved output of the 
second decoder. 
Turbo decoder 


The decoders themselves generally use a soft output Viterbi algorithm 
(SOVA) to decode the received data. However, the preferred turbo decoding 
method is to use the maximum a posteriori (MAP) algorithm, but this is too 
mathematical to discuss here! 


Figure annotation: iterative decoding of the rate 1/2 code; interleaving 256x256. 


Probability of error for turbo decoders with variable number of 
iterations 


Coder performance 


[link] shows these 1/2 rate decoders operating at much lower $E_b/N_0$ or SNR 
values than the convolutional Viterbi decoders of the previous section and, 
further, as the number of iterations increases beyond 15, the 
performance comes very close to the theoretical Shannon bound. 


This is the attraction that has excited the FECC community, who were 
unable to achieve this low error rate before 1993! Now that iterative 
decoding has been introduced for turbo decoders it is also being re-applied 
in low density parity check (LDPC) decoders with equal enthusiasm and 
success. 


Turbo Code Example 


[link] includes a turbo decoding example which, as an animated PowerPoint 
slide, shows the noise induced errors (the black dots) being corrected on 
each subsequent iteration, with the black dots being progressively reduced in 
the upper cartoon. 




